Mean-Field Limits for Two-Layer Neural Networks Trained with Consensus-Based Optimization

De Deyn, William, Herty, Michael, Samaey, Giovanni

arXiv.org Artificial Intelligence

Artificial Intelligence has witnessed remarkable progress over the past decades, both in its capabilities and its range of applications. Today, neural networks are present in a variety of fields. One classical application is function approximation, which is supported by the universal approximation theory [34]. In computer vision, convolutional neural networks form the backbone of most modern architectures [39, 38], while the framework of neural ordinary differential equations has contributed significantly to optimal control problems [17, 10]. In natural language processing and speech recognition, recurrent neural networks and their long short-term memory variants have yielded significant performance improvements [33, 51]. More recently, diffusion models have proven to be powerful generative models, with applications ranging from image denoising to video generation [56]. Neural networks have even found their way into scientific computing. The most notable example is physics-informed neural networks, which are capable of solving both forward and inverse problems governed by partial differential equations [50]. A neural network can be viewed, in general, as a function parametrized by a set of weights and biases, which we collectively refer to as parameters.


Gene Incremental Learning for Single-Cell Transcriptomics

Qi, Jiaxin, Cui, Yan, Huang, Jianqiang, Xie, Gaogang

arXiv.org Artificial Intelligence

Classes, as fundamental elements of Computer Vision, have been extensively studied within incremental learning frameworks. In contrast, tokens, which play essential roles in many research fields, exhibit similar characteristics of growth, yet investigations into their incremental learning remain significantly scarce. This research gap primarily stems from the holistic nature of tokens in language, which imposes significant challenges on the design of incremental learning frameworks for them. To overcome this obstacle, in this work, we turn to a type of token, the gene, in a large-scale biological dataset--single-cell transcriptomics--to formulate a pipeline for gene incremental learning and establish corresponding evaluations. We found that the forgetting problem also arises in gene incremental learning, and we therefore adapted existing class-incremental learning methods to mitigate the forgetting of genes. Through extensive experiments, we demonstrated the soundness of our framework design and evaluations, as well as the effectiveness of our method adaptations. Finally, we provide a complete benchmark for gene incremental learning in single-cell transcriptomics.



Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance

Zuo, Jingwei, Velikanov, Maksim, Chahed, Ilyas, Belkada, Younes, Rhayem, Dhia Eddine, Kunsch, Guillaume, Hacid, Hakim, Yous, Hamza, Farhat, Brahim, Khadraoui, Ibrahim, Farooq, Mugariya, Campesan, Giulia, Cojocaru, Ruxandra, Djilali, Yasser, Hu, Shi, Chaabane, Iheb, Khanna, Puneesh, Seddik, Mohamed El Amine, Huynh, Ngoc Dung, Khac, Phuc Le, AlQadi, Leen, Mokeddem, Billel, Chami, Mohamed, Abubaker, Abdalgader, Lubinets, Mikhail, Piskorski, Kacper, Frikha, Slim

arXiv.org Artificial Intelligence

In this report, we introduce Falcon-H1, a new series of large language models (LLMs) featuring hybrid architecture designs optimized for both high performance and efficiency across diverse use cases. Unlike earlier Falcon models built solely on Transformer or Mamba architectures, Falcon-H1 adopts a parallel hybrid approach that combines Transformer-based attention with State Space Models (SSMs), known for superior long-context memory and computational efficiency. We systematically revisited model design, data strategy, and training dynamics, challenging conventional practices in the field. Falcon-H1 is released in multiple configurations, including base and instruction-tuned variants at 0.5B, 1.5B, 1.5B-deep, 3B, 7B, and 34B parameters. Quantized instruction-tuned models are also available, totaling over 30 checkpoints on Hugging Face Hub. Falcon-H1 models demonstrate state-of-the-art performance and exceptional parameter and training efficiency. The flagship Falcon-H1-34B matches or outperforms models up to 70B scale, such as Qwen3-32B, Qwen2.5-72B, and Llama3.3-70B, while using fewer parameters and less data. Smaller models show similar trends: the Falcon-H1-1.5B-Deep rivals current leading 7B-10B models, and Falcon-H1-0.5B performs comparably to typical 7B models from 2024. These models excel across reasoning, mathematics, multilingual tasks, instruction following, and scientific knowledge. With support for up to 256K context tokens and 18 languages, Falcon-H1 is suitable for a wide range of applications. All models are released under a permissive open-source license, underscoring our commitment to accessible and impactful AI research.
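The "parallel hybrid" idea described above can be illustrated with a minimal sketch: an attention branch and a state-space branch process the same input, and their outputs are summed into a residual stream. This is not the actual Falcon-H1 block; single-head attention, a diagonal linear SSM, plain summation of the branches, and all dimensions and parameter names here are simplifying assumptions for illustration only.

```python
import numpy as np

def causal_attention(x, Wq, Wk, Wv):
    """Single-head causal self-attention (no dropout, no multi-head split)."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = (q @ k.T) / np.sqrt(k.shape[-1])
    scores[np.triu(np.ones(scores.shape, dtype=bool), k=1)] = -np.inf  # mask future tokens
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def diagonal_ssm(x, a, B, C):
    """Diagonal linear state-space scan: h_t = a * h_{t-1} + B x_t, y_t = C h_t."""
    h = np.zeros_like(a)
    ys = []
    for x_t in x:
        h = a * h + B @ x_t
        ys.append(C @ h)
    return np.stack(ys)

def hybrid_block(x, p):
    """Parallel hybrid: both branches see the same input; outputs join a residual stream."""
    return (x
            + causal_attention(x, p['Wq'], p['Wk'], p['Wv'])
            + diagonal_ssm(x, p['a'], p['B'], p['C']))

# toy demo with random parameters
rng = np.random.default_rng(0)
d, n, T = 4, 3, 8                                   # model dim, SSM state dim, sequence length
p = {'Wq': rng.normal(size=(d, d)) * 0.1,
     'Wk': rng.normal(size=(d, d)) * 0.1,
     'Wv': rng.normal(size=(d, d)) * 0.1,
     'a':  rng.uniform(0.0, 0.9, size=n),           # stable decay rates
     'B':  rng.normal(size=(n, d)) * 0.1,
     'C':  rng.normal(size=(d, n)) * 0.1}
x = rng.normal(size=(T, d))
y = hybrid_block(x, p)
```

Both branches are causal (the attention via masking, the SSM via its recurrence), so perturbing a later token leaves earlier outputs unchanged, which is easy to verify on this sketch.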


SHARP: Synthesizing High-quality Aligned Reasoning Problems for Large Reasoning Models Reinforcement Learning

Wu, Xiong Jun, Zhang, Zhenduo, Wen, ZuJie, Zhang, Zhiqiang, Ren, Wang, Shi, Lei, Chen, Cai, Zhao, Deng, Wang, Qing, Han, Xudong, Tang, Chengfu, Jin, Dingnan, Cui, Qing, Zhou, Jun

arXiv.org Artificial Intelligence

Training large reasoning models (LRMs) with reinforcement learning in STEM domains is hindered by the scarcity of high-quality, diverse, and verifiable problem sets. Existing synthesis methods, such as Chain-of-Thought prompting, often generate oversimplified or uncheckable data, limiting model advancement on complex tasks. To address these challenges, we introduce SHARP, a unified approach to Synthesizing High-quality Aligned Reasoning Problems for LRM reinforcement learning with verifiable rewards (RLVR). SHARP encompasses a strategic set of self-alignment principles -- targeting graduate and Olympiad-level difficulty, rigorous logical consistency, and unambiguous, verifiable answers -- and a structured three-phase framework (Alignment, Instantiation, Inference) that ensures thematic diversity and fine-grained control over problem generation. We implement SHARP by leveraging a state-of-the-art LRM to infer and verify challenging STEM questions, then employ a reinforcement learning loop to refine the model's reasoning through verifiable reward signals. Experiments on benchmarks such as GPQA demonstrate that SHARP-augmented training substantially outperforms existing methods, markedly improving complex reasoning accuracy and pushing LRM performance closer to expert-level proficiency. Our contributions include the SHARP strategy, framework design, end-to-end implementation, and experimental evaluation of its effectiveness in elevating LRM reasoning capabilities.


Simulation-informed deep learning for enhanced SWOT observations of fine-scale ocean dynamics

Cutolo, Eugenio, Granero-Belinchon, Carlos, Thiraux, Ptashanna, Wang, Jinbo, Fablet, Ronan

arXiv.org Machine Learning

Oceanic processes at fine scales are crucial yet difficult to observe accurately due to limitations in satellite and in-situ measurements. The Surface Water and Ocean Topography (SWOT) mission provides high-resolution Sea Surface Height (SSH) data, though noise patterns often obscure fine-scale structures. Current methods struggle with noisy data or require extensive supervised training, limiting their effectiveness on real-world observations. We introduce SIMPGEN (Simulation-Informed Metric and Prior for Generative Ensemble Networks), an unsupervised adversarial learning framework combining real SWOT observations with simulated reference data. SIMPGEN leverages wavelet-informed neural metrics to distinguish noisy from clean fields, guiding realistic SSH reconstructions. Applied to SWOT data, SIMPGEN effectively removes noise, preserving fine-scale features better than existing neural methods. This robust, unsupervised approach not only improves SWOT SSH data interpretation but also demonstrates strong potential for broader oceanographic applications, including data assimilation and super-resolution.


JUREX-4E: Juridical Expert-Annotated Four-Element Knowledge Base for Legal Reasoning

Liu, Huanghai, Huang, Quzhe, Chen, Qingjing, Hu, Yiran, Ma, Jiayu, Liu, Yun, Shen, Weixing, Feng, Yansong

arXiv.org Artificial Intelligence

The Four-Element Theory is a fundamental framework in criminal law, defining the constitution of crime through four dimensions: Subject, Object, Subjective aspect, and Objective aspect. This theory is widely referenced in legal reasoning, and many Large Language Models (LLMs) attempt to incorporate it when handling legal tasks. However, current approaches rely on LLMs' internal knowledge to incorporate this theory, often lacking completeness and representativeness. To address this limitation, we introduce JUREX-4E, an expert-annotated knowledge base covering 155 criminal charges. It is structured through a progressive hierarchical annotation framework that prioritizes legal source validity and employs diverse legal interpretation methods to ensure comprehensiveness and authority. We evaluate JUREX-4E on the Similar Charge Distinction task and apply it to Legal Case Retrieval, demonstrating its effectiveness in improving LLM performance. Experimental results validate the high quality of JUREX-4E and its substantial impact on downstream legal tasks, underscoring its potential for advancing legal AI applications. Code: https://github.com/THUlawtech/JUREX


LegalBench.PT: A Benchmark for Portuguese Law

Canaverde, Beatriz, Pires, Telmo Pessoa, Ribeiro, Leonor Melo, Martins, André F. T.

arXiv.org Artificial Intelligence

The recent application of LLMs to the legal field has spurred the creation of benchmarks across various jurisdictions and languages. However, no benchmark has yet been specifically designed for the Portuguese legal system. In this work, we present LegalBench.PT, the first comprehensive legal benchmark covering key areas of Portuguese law. To develop LegalBench.PT, we first collect long-form questions and answers from real law exams, and then use GPT-4o to convert them into multiple-choice, true/false, and matching formats. Once generated, the questions are filtered and processed to improve the quality of the dataset. To ensure accuracy and relevance, we validate our approach by having a legal professional review a sample of the generated questions. Although the questions are synthetically generated, we show that their grounding in human-created exams, combined with our rigorous filtering and processing, results in a reliable benchmark for assessing LLMs' legal knowledge and reasoning abilities. Finally, we evaluate the performance of leading LLMs on LegalBench.PT and investigate potential biases in GPT-4o's responses. We also assess the performance of Portuguese lawyers on a sample of questions to establish a baseline for model comparison and validate the benchmark.


MirrorCBO: A consensus-based optimization method in the spirit of mirror descent

Bungert, Leon, Hoffmann, Franca, Kim, Doh Yeon, Roith, Tim

arXiv.org Artificial Intelligence

In this work we propose MirrorCBO, a consensus-based optimization (CBO) method which generalizes standard CBO in the same way that mirror descent generalizes gradient descent. For this we apply the CBO methodology to a swarm of dual particles and retain the primal particle positions by applying the inverse of the mirror map, which we parametrize as the subdifferential of a strongly convex function $\phi$. In this way, we combine the advantages of a derivative-free non-convex optimization algorithm with those of mirror descent. As a special case, the method extends CBO to optimization problems with convex constraints. Assuming bounds on the Bregman distance associated with $\phi$, we provide asymptotic convergence results for MirrorCBO with explicit exponential rate. Another key contribution is an exploratory numerical study of this new algorithm across different application settings, focusing on (i) sparsity-inducing optimization, and (ii) constrained optimization, demonstrating the competitive performance of MirrorCBO. We observe empirically that the method can also be used for optimization on (non-convex) submanifolds of Euclidean space, can be adapted to mirrored versions of other recent CBO variants, and that it inherits from mirror descent the capability to select desirable minimizers, like sparse ones. We also include an overview of recent CBO approaches for constrained optimization and compare their performance to MirrorCBO.
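The dual-particle scheme described above can be sketched in a few lines: CBO dynamics drive dual particles, while primal positions are recovered through the inverse mirror map. The sketch below is a toy illustration under stated assumptions, not the paper's implementation: the choice $\phi(x) = \tfrac{1}{2}\|x\|^2 + \gamma\|x\|_1$ (whose inverse mirror map is soft-thresholding, the sparsity-inducing case), the Euler-Maruyama discretization, and all parameter values (`lam`, `sigma`, `alpha`, `gamma`) are illustrative assumptions.

```python
import numpy as np

def soft_threshold(y, gamma):
    # Inverse mirror map for phi(x) = 0.5*||x||^2 + gamma*||x||_1 (sparsity-inducing choice)
    return np.sign(y) * np.maximum(np.abs(y) - gamma, 0.0)

def mirror_cbo(f, n_particles=100, dim=2, gamma=0.1,
               lam=1.0, sigma=0.5, alpha=50.0, dt=0.01, steps=500, seed=0):
    """Toy MirrorCBO loop: CBO drift/noise act on dual particles y;
    primal positions x are recovered via the inverse mirror map."""
    rng = np.random.default_rng(seed)
    y = rng.normal(size=(n_particles, dim))              # dual particles
    for _ in range(steps):
        x = soft_threshold(y, gamma)                     # primal positions
        fx = f(x)
        w = np.exp(-alpha * (fx - fx.min()))             # stabilized Gibbs weights
        m = (w[:, None] * x).sum(axis=0) / w.sum()       # consensus point (primal)
        drift = x - m
        noise = rng.normal(size=y.shape)
        y += (-lam * drift * dt
              + sigma * np.linalg.norm(drift, axis=1, keepdims=True)
                * noise * np.sqrt(dt))                   # Euler-Maruyama step
    return soft_threshold(y, gamma), m

# minimize a quadratic whose minimizer (1, 0) is sparse in the second coordinate
f = lambda x: np.sum((x - np.array([1.0, 0.0]))**2, axis=1)
x_final, consensus = mirror_cbo(f)
```

Note the scheme is derivative-free: only evaluations of `f` enter through the Gibbs weights, while the soft-threshold map biases the consensus toward sparse iterates, mirroring the minimizer-selection behavior discussed in the abstract.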


The Use of Readability Metrics in Legal Text: A Systematic Literature Review

Han, Yu, Ceross, Aaron, Bergmann, Jeroen H. M.

arXiv.org Artificial Intelligence

Understanding the text in legal documents can be challenging due to their complex structure and the inclusion of domain-specific jargon. Laws and regulations are often crafted in such a manner that engagement with them requires formal training, potentially leading to vastly different interpretations of the same texts. Linguistic complexity is an important contributor to the difficulties experienced by readers. Simplifying texts could enhance comprehension across a broader audience, not just among trained professionals. Various metrics have been developed to measure document readability. Therefore, we adopted a systematic review approach to examine the linguistic and readability metrics currently employed for legal and regulatory texts. A total of 3566 initial papers were screened, of which 34 relevant studies were retained for full assessment. Our primary objective was to identify which current metrics were applied for evaluating readability within the legal field. Sixteen different metrics were identified, with the Flesch-Kincaid Grade Level being the most frequently used method. The majority of studies (73.5%) were found in the domain of "informed consent forms". From the analysis, it is clear that not all legal domains are well represented in terms of readability metrics, and that further work is needed to build consensus on which metrics should be applied to legal documents.
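The Flesch-Kincaid Grade Level mentioned above is a simple closed-form metric: 0.39 * (words per sentence) + 11.8 * (syllables per word) - 15.59. A minimal sketch follows; the crude vowel-group syllable counter is an assumption for illustration (production readability tools use pronunciation dictionaries or trained syllabifiers).

```python
import re

def flesch_kincaid_grade(text):
    """Flesch-Kincaid Grade Level:
       0.39*(words/sentences) + 11.8*(syllables/words) - 15.59
    Syllables are approximated by counting vowel groups, which is only a heuristic."""
    n_sentences = max(1, len(re.findall(r'[.!?]+', text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))

    def count_syllables(word):
        groups = re.findall(r'[aeiouy]+', word.lower())
        n = len(groups)
        if word.lower().endswith('e') and n > 1:
            n -= 1                      # drop a typical silent final 'e'
        return max(1, n)

    n_syllables = sum(count_syllables(w) for w in words)
    return 0.39 * n_words / n_sentences + 11.8 * n_syllables / n_words - 15.59
```

Even this rough approximation reproduces the qualitative behavior the review relies on: a short plain-language sentence scores several grade levels below a jargon-heavy legal clause.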